Pace job starts and raise rate-limit retries#394
Conversation
|
Updates to Preview Branch (work/quirky-wilbur-be0a2f) ↗︎
Tasks are run on every commit but only new migration files are pushed.
View logs for this Workflow Run ↗︎. |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughThe PR increases the default job retry cap for rate-limited failures from 3 to 8 (via configurable ChangesJob Scheduler Rate-Limiting and Pacer Configuration
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
Release VersionsApp patch: ChangelogChanged
|
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
internal/jobs/types.go (1)
1-149: ⚡ Quick winRun
gofmtandgoimportson this file.As per coding guidelines, touched Go files should be formatted with
gofmtandgoimports.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/jobs/types.go` around lines 1 - 149, This file is not formatted per project Go style; run gofmt (or gofmt -s) and goimports on this file to fix spacing, import ordering, and unused-imports issues, then re-run `go vet`/build to ensure no unused imports remain; focus changes will appear around the package/import block and declarations such as JobStatus/TaskStatus constants and structs Job, Task, JobOptions, and QuotaExceededError—commit the formatted file.internal/jobs/executor.go (1)
1-549: ⚡ Quick winRun
gofmtandgoimportson this file.As per coding guidelines, touched Go files should be formatted with
gofmtandgoimports.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/jobs/executor.go` around lines 1 - 549, The file internal/jobs/executor.go is not formatted per project policy; run gofmt and goimports on it to fix whitespace, indentation and import ordering/unused imports (this will update imports used by symbols like TaskExecutor, Execute, ConstructTaskURL, buildHTMLUpload, gzipHTML, canonicalHTMLContentType, normalisedHTMLContentType, etc.). Use gofmt (or go fmt) to normalize formatting and goimports to group and remove unused imports so the file compiles and matches repository style. After running both tools, re-run go vet/build to ensure no new import errors remain and commit the formatted file.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@internal/jobs/executor.go`:
- Around line 1-549: The file internal/jobs/executor.go is not formatted per
project policy; run gofmt and goimports on it to fix whitespace, indentation and
import ordering/unused imports (this will update imports used by symbols like
TaskExecutor, Execute, ConstructTaskURL, buildHTMLUpload, gzipHTML,
canonicalHTMLContentType, normalisedHTMLContentType, etc.). Use gofmt (or go
fmt) to normalize formatting and goimports to group and remove unused imports so
the file compiles and matches repository style. After running both tools, re-run
go vet/build to ensure no new import errors remain and commit the formatted
file.
In `@internal/jobs/types.go`:
- Around line 1-149: This file is not formatted per project Go style; run gofmt
(or gofmt -s) and goimports on this file to fix spacing, import ordering, and
unused-imports issues, then re-run `go vet`/build to ensure no unused imports
remain; focus changes will appear around the package/import block and
declarations such as JobStatus/TaskStatus constants and structs Job, Task,
JobOptions, and QuotaExceededError—commit the formatted file.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 80e771cd-8deb-4f76-b883-cfcfadfc5c59
📒 Files selected for processing (6)
CHANGELOG.mddocs/operations/ENV_VARS.mdinternal/jobs/executor.gointernal/jobs/pacer_seed_test.gointernal/jobs/stream_worker.gointernal/jobs/types.go
|
🐝 Review App Deployed Homepage: https://hover-pr-394.fly.dev |
Summary
Two independent, cautious changes to how the crawler handles target-domain rate limiting (429/403/503). Motivated by a 10k-task load test where 44% of tasks hard-failed, almost entirely on HTTP 429 — DB and infra were healthy, so the dominant limiter on successful throughput is target-domain throttling, not infrastructure.
1. Raise blocking retries (deferred, not failed)
MaxBlockingRetrieswas hardcoded to 3, so a rate-limited task burned all attempts in <1s (backoff 100/200/400ms) and hard-failed while the domain was still throttling.GNH_RATE_LIMIT_MAX_RETRIES(previously documented inENV_VARS.mdbut never wired) with a raised default of 8, so a persistently-throttling domain is paced-and-deferred across the widening per-domain delay (up to the 60s cap) rather than counted as a failure.2. Pace every job start (no first-wave burst)
GNH_PACER_START_FLOOR_DELAY_MS, default 500ms,0restores prior behaviour). A previously-crawled domain now starts at ~3 concurrent requests and widens on success, instead of bursting. A never-crawled domain keeps the cautious 2000ms warm-up (~1 worker). A domain with a higher learned delay is unchanged.seedAdaptiveDelayMS(...)and unit-tested.Resulting job-start ramp
The cap then widens to full job concurrency via the existing success step-down.
Reviewer notes
GNH_PACER_START_FLOOR_DELAY_MS=0to revert Part 2.Test plan
go build ./...go test ./internal/jobs/ ./internal/broker/seedAdaptiveDelayMS(warm-up, floor, disabled-floor cases)Need help on this PR? Tag
@codesmithwith what you need. Autofix is disabled.Summary by CodeRabbit
Improvements
Documentation
Tests